Large-scale parallel data clustering

Authors

Dan Judd

Philip K. McKinley

Anil K. Jain

Abstract

Algorithmic enhancements are described that enable large computational reduction in mean square-error data clustering. These improvements are incorporated into a parallel data-clustering tool, P-CLUSTER, designed to execute on a network of workstations. Experiments involving the unsupervised segmentation of standard texture images were performed. For some data sets, a 96 percent reduction in computation was achieved.
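
For readers unfamiliar with the criterion, mean square-error clustering can be illustrated with the standard k-means iteration, which alternates between assigning each pattern to its nearest cluster center and recomputing the centers. The sketch below is a plain, sequential illustration of that criterion only; it is not P-CLUSTER and does not include the paper's algorithmic enhancements or its parallel design. The function names (`squared_error`, `assign`, `update`, `kmeans`) are hypothetical.

```python
def squared_error(points, centers, labels):
    """Sum of squared Euclidean distances from each point to its assigned center."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centers[label]))
        for point, label in zip(points, labels)
    )


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    labels = []
    for point in points:
        dists = [
            sum((p - c) ** 2 for p, c in zip(point, center))
            for center in centers
        ]
        labels.append(dists.index(min(dists)))
    return labels


def update(points, labels, centers):
    """Move each center to the mean of its members; keep empty clusters in place."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [pt for pt, lb in zip(points, labels) if lb == k]
        if not members:
            new_centers.append(tuple(old))
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centers


def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the labels stop changing."""
    centers = [tuple(c) for c in centers]
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels
```

Most of the work in each iteration is the distance computation inside `assign`, so that is where a sketch like this spends its time on large data sets.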

Similar resources

Data clustering based on key identification

Clustering has been one of the main building blocks in the fields of machine learning and computer vision. Given a pairwise distance measure, it is challenging to identify a subset of representative exemplars and their associated cluster structures. The recent trend in big data analysis places a more demanding requirement on new clustering algorithms to be both scalable and accura...

High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation

Image segmentation is one of the most common steps in digital image processing. Many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) are employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...

Parallel K-Means Clustering Based on MapReduce

Data clustering has received considerable attention in many applications, such as data mining, document retrieval, image segmentation, and pattern classification. The growing volume of information produced by technological progress makes clustering very large-scale data a challenging task. To deal with this problem, many researchers try to design efficient parallel clu...

Large-scale disease clustering and its application in epidemiology and health studies

Spatial autocorrelation statistics provide summary information about the spatial arrangement of data in a map. In fact, these statistics compare neighboring area values in order to assess the level of large scale clustering. Whenever a large number of neighboring areas have either relatively large or relatively small values, large scale clustering may be detected. Detecting such clustering is a...

Parallel D2-Clustering: Large-Scale Clustering of Discrete Distributions

The discrete distribution clustering algorithm, namely D2-clustering, has demonstrated its usefulness in image classification and annotation, where each object is represented by a bag of weighted vectors. The high computational complexity of the algorithm, however, limits its application to large-scale problems. We present a parallel D2-clustering algorithm with substantially improved scalabilit...

Publication date: 1996